Kernel-Based Learning of Hierarchical Multilabel Classification Models
Abstract
We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Markov tree equipped with an exponential family defined on the edges. We present an efficient optimization algorithm based on incremental conditional gradient ascent in single-example subspaces spanned by the marginal dual variables. The optimization is facilitated with a dynamic programming based algorithm that computes best update directions in the feasible set. Experiments show that the algorithm can feasibly optimize training sets of thousands of examples and classification hierarchies consisting of hundreds of nodes. Training of the full hierarchical model is as efficient as training independent SVM-light classifiers for each node. The algorithm’s predictive accuracy was found to be competitive with other recently introduced hierarchical multicategory or multilabel classification learning algorithms.
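The dynamic programming step mentioned in the abstract can be illustrated with a generic max-sum pass over a tree whose scores are defined on edge labelings. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `best_tree_labeling`, the `parent` array encoding (node 0 is the root and every parent index is smaller than its child's), and the toy score tables are all hypothetical. In the paper's setting the edge scores would come from the current gradient of the dual objective; here they are simply given numbers.

```python
def best_tree_labeling(parent, edge_score):
    """Find the binary labeling of a tree that maximizes the sum of edge scores.

    parent[i] is the parent of node i (parent[0] == -1 for the root), and we
    assume parent[i] < i for every non-root node.
    edge_score[i][(yp, yc)] is the score of edge (parent[i], i) when the parent
    has label yp and the child has label yc (hypothetical input format).
    """
    n = len(parent)
    # best[i][y]: best total score of the subtree below node i, given label y at i
    best = [[0.0, 0.0] for _ in range(n)]
    # arg[i][yp]: best label for node i given that its parent has label yp
    arg = [[0, 0] for _ in range(n)]

    # Bottom-up pass: because parent[i] < i, every child of node i has already
    # been folded into best[i] by the time node i itself is processed.
    for i in range(n - 1, 0, -1):
        p = parent[i]
        for yp in (0, 1):
            cands = [edge_score[i][(yp, yc)] + best[i][yc] for yc in (0, 1)]
            yc = 0 if cands[0] >= cands[1] else 1
            arg[i][yp] = yc
            best[p][yp] += cands[yc]

    # Top-down pass: fix the root label, then read off each child's best label.
    labels = [0] * n
    labels[0] = 0 if best[0][0] >= best[0][1] else 1
    for i in range(1, n):
        labels[i] = arg[i][labels[parent[i]]]
    return labels, best[0][labels[0]]


# Toy hierarchy: node 0 is the root, nodes 1 and 2 are its children,
# and node 3 is a child of node 1. Scores are made up for illustration.
parent = [-1, 0, 0, 1]
edge_score = [
    None,
    {(0, 0): 0.0, (0, 1): -1.0, (1, 0): 0.2, (1, 1): 1.0},
    {(0, 0): 0.0, (0, 1): -1.0, (1, 0): 0.5, (1, 1): -0.3},
    {(0, 0): 0.0, (0, 1): -2.0, (1, 0): 0.1, (1, 1): 0.7},
]
labels, score = best_tree_labeling(parent, edge_score)
print(labels, score)
```

The pass visits each edge a constant number of times, so its cost grows linearly with the number of nodes, which is what makes a search over the exponentially many multilabels tractable on a tree.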
Related resources
Multilabel Classification over Category Taxonomies
Abstract of “Multilabel Classification over Category Taxonomies” by Lijuan Cai, Ph.D., Brown University, May 2008. Multilabel classification is the task of assigning a pattern to one or more classes or categories from a pre-defined set of classes. It is a crucial tool in knowledge and content management. Standard machine learning techniques such as Support Vector Machines (SVMs) and Perceptron have been...
Multilabel Classification Evaluation using Ontology Information
Multilabel classification using ontology information is an emerging research area that combines machine learning methods with knowledge models. The performance assessment of such classification systems poses new challenges. We propose an evaluation measure that considers the mapping of label sets to their ground truth and allows for the incorporation of real-world knowledge. A distance-based mea...
On Maximum Margin Hierarchical Multilabel Classification
We present work in progress towards maximum margin hierarchical classification where the objects are allowed to belong to more than one category at a time. The classification hierarchy is represented as a Markov network equipped with an exponential family defined on the edges. We present a variation of the maximum margin multilabel learning framework, suited to the hierarchical classification t...
Adapting non-hierarchical multilabel classification methods for hierarchical multilabel classification
In most classification problems, a classifier assigns a single class to each instance and the classes form a flat (non-hierarchical) structure, without superclasses or subclasses. In hierarchical multilabel classification problems, the classes are hierarchically structured, with superclasses and subclasses, and instances can be simultaneously assigned to two or more classes at the same hierarch...
Hierarchical Multilabel Classification Trees for Gene Function Prediction (Extended Abstract)
Prediction of gene function is a so-called hierarchical multilabel classification (HMC) task: a single instance can be labelled with multiple classes rather than just one (i.e., a gene can have multiple functions), and these classes are organized in a hierarchy. Many machine learning methods focus on learning predictive models with a single target variable. One can then learn to predict all cla...
Journal title:
- Journal of Machine Learning Research
Volume 7
Pages -
Publication date 2006